Pseudoparticle Multipole Method: A Simple Method to 
Implement High- Accuracy Treecode 
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ABSTRACT 

In this letter we describe the pseudoparticle multipole method (P 2 M 2 ), a new method to 
express multipole expansion by a distribution of pseudoparticles. We can use this distribution of 
particles to calculate high order terms in both the Barnes-Hut treecode and FMM. The primary 
advantage of P 2 M 2 is that it works on GRAPE. GRAPE is a special-purpose hardware for the 
calculation of gravitational force between particles. Although the treecode has been implemented 
on GRAPE, we could handle terms only up to dipole, since GRAPE can calculate forces from 
point-mass particles only. Thus the calculation cost grows quickly when high accuracy is required. 
With P 2 M 2 , the multipole expansion is expressed by particles, and thus GRAPE can calculate 
high order terms. Using P 2 M 2 , we implemented an arbitrary-order treecode on GRAPE-4. Tim- 
ing result shows GRAPE-4 accelerates the calculation by a factor between 10 (for low accuracy) 
to 150 (for high accuracy). Even on general-purpose programmable computers, our method offers 
the advantage that the mathematical formulae and therefore the actual program is much simpler 
than that of the direct implementation of multipole expansion. 
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The calculation of the gravitational force is the 
most expensive part of almost all iV-body simu- 
lations. The Barnes-Hut treecode (Barnes & Hut 
1986) is a widely used algorithm to reduce the cost 
of the force calculation. In the treecode, forces on 
a particles from distant particles are replaced by 
multipole expansions of groups of particles. More 
distant particles are organized into larger groups, 
so that the truncation error of the expansion is 
similar everywhere. Hierarchical tree structure is 
used to form grouping efficiently The calculation 
cost is reduced from 0(N 2 ) to 0(N log AT). 



Even with the treecode, the cost of force cal- 
culation is still high, and it dominates the to- 
tal calculation cost. In order to accelerate the 
treecode further, we can use GRAPE (Sugimoto 
et al. 1990; Makino & Taiji 1998). GRAPE is 
a special-purpose hardware for the calculation of 
gravitational force between particles. It works 
in cooperation with a general-purpose computer 
(host computer). The host computer does every- 
thing except for the force calculation. The appli- 
cation of GRAPE to the treecode is introduced by 
Makino (1991), who implemented Barnes' (1990) 
modified algorithm on GRAPE-lA (Fukushige et 
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al. 1991). Athanassoula et al. (1998) and Kawai et 
al. (2000) reported its performance on GRAPE-3 
and GRAPE-5, respectively. The speedup factor 
they obtained is in the range of 30 to 50. 

Although GRAPE can significantly accelerate 
the treecode, its application has been limited to 
simulations where the requirement for the accu- 
racy is modest. Since GRAPE can only calcu- 
late forces from point mass particles, we have not 
been able to handle terms of the multipolc expan- 
sion higher than dipole. Thus, the calculation cost 
grows quickly when high accuracy is required. 

In this letter, we introduce the pseudoparticle 
multipole method (P 2 M 2 ) which makes it possible 
to evaluate higher-order expansions on GRAPE. 
In P 2 M 2 , multipolc expansions are represented by 
a small number of pseudoparticles. The masses 
and positions of pseudoparticles are determined 
so that they have the same expansion coefficients 
as the corresponding physical particles up to the 
specified order. With the P 2 M 2 , we can use 
GRAPE to evaluate high order terms, since they 
are expressed by particles. 

In this letter, we first describe the procedure to 
express quadrupole moment using three particles 
(§2). Then we briefly describe the generalization 
to higher order expansion (§ 3). Finally we give 
the result of numerical tests on GRAPE-4 (§ 4, 
§5). 

2. Quadrupole Moment with Pseudoparti- 
cles 

In P 2 M 2 , we distribute the pseudoparticles so 
that they exactly reproduce the coefficients of the 
multipole expansion of real (physical) particles up 
to a given order. Conceptually, in order to obtain 
such a distribution, first we calculate the expan- 
sion coefficients from the distribution of physical 
particles. We then solve the inverse problem to ob- 
tain the masses and positions of pseudoparticles. 
In the following, we describe a practical procedure 
to obtain the distribution of the pseudoparticles 
which can be used up to quadrupoles. 

In Cartesian coordinates, the multipole expan- 
sion up to quadrupoles of the potential due to N 



particles is expressed as 
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The mass Mj and the position Rj of pseudoparti- 
cles must be determined so that they give the same 
$ up to a given order p. In general, the expan- 
sion up to the p-th order has (p + l) 2 independent 
terms. Since each pseudoparticle has four degrees 
of freedom (one for mass and three for position), 
theoretically we can reproduce the expansion us- 
ing K m i n (p) = \(p+ l) 2 /4] pseudoparticles. Here 
\x] denotes the minimum integer not smaller than 
x. 

In order to express multipole expansion of order 
p = 0, we need -Kmin(O) = 1 pseudoparticle. We 
must put the particle so that Mi and Ri repro- 
duce the first term (monopole term) of the right 
hand side of equation (1). For this purpose we 
can set the mass Mi = M, where M is the total 
mass of physical particles. At least formally, the 
position of the pseudoparticle can be anywhere. In 
figure 1 we place them at the origin as example. In 
practice, we would never use zeroth order expan- 
sion since we can achieve the first order accuracy 
with one particle, as we will see below. 
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Fig. 1. — The positions of pseudoparticles which 
reproduce the multipole expansion up to monopole 
(left), dipole (middle), and quadrupole (right). 

For p = 1 , Mj and Rj must reproduce the first 
and the second term (dipole term) of the right 
hand side of equation (1). We can satisfy this 
condition by placing a single pseudoparticle with 
mass M at the center of mass of physical particles, 
r cm (see figure lb), as is done with the original 
Barnes-Hut treecode. 

For p = 2, the minimum number of pseudopar- 
ticles necessary is A" m i n (2) = 3. In the following, 
we'll see whether we can actually construct the 
distribution of three pseudoparticles which repro- 
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duces the multipole expansion up to quadrupole. 

In order to reproduce the first and the second 
terms, the total mass of the pseudoparticles should 
be M, and their center of mass should be located 
at r cm . This is achieved by making their total 
mass M and center of mass to be the same as that 
of physical particles. In the following, we use a 
coordinate system shifted so that r cm = 0. 

Pseudoparticles should have the same quadrupole 
tensor 
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as physical particles to reproduce the third term 
in equation (1). 

By definition, A is symmetric and traceless. 
Therefore we can choose the coordinate axe so that 
A is diagonalized. In this coordinate system, A is 
expressed as: 
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U -(ai + a 2 ) 

We can choose ai and a 2 , so that the relation 

ai > a 2 > -(ai + o 2 ) (4) 

is satisfied. 

Obviously, all three pseudoparticles should be 
on the x-y plane. Now our problem is reduced 
to determining the masses and positions of three 
particles on the x-y plane so that they give the 
quadrupole moment tensor of the form (3). 

There are a variety of ways to satisfy this re- 
quirement. Here we give just one example. 

We still have three extra degrees of freedom, 
since we can change masses and positions of two 
particles on the x-y plane freely, and we have only 
three constraint. In our procedure, we set the 
masses as M x = M 2 = M 3 = M/3. These masses 
reproduce the first term of the equation (1). Now 
we have only one extra degree of freedom left. We 
use it by setting x component of JSi to 0. Now we 
can determine the position vectors as follows: 
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where a and (3 are defined as 

a = ^(2 ai +a 2 )/M, [3 = y/( ai + 2a 2 )/(3M). 

(6) 

Note that both a and f3 are guaranteed to be real 
numbers. As we have already mentioned, this so- 
lution is not unique. For example, if we set y, 
instead of x, component of Rx to 0, we obtain 
another solution that attains the same order of 
accuracy. 

3. Higher Order Generalization: Use of 
Spherical Design 

Here we describe the generalization of P 2 M 2 for 
multipole expansion of higher order. Using the 
procedure we described in the previous section, we 
can express multipole expansion up to quadrupole, 
with the minimum number of pseudoparticles the- 
oretically required. However, the described proce- 
dure depends closely on the nature of quadrupolc- 
moment tensor, and it is difficult to extend this 
procedure to higher orders. 

Makino (1999) proposed a different approach 
based on the orthogonality of spherical harmonics. 
His approach gives a systematic procedure to ob- 
tain the distribution for an arbitrary order. Since 
his procedure needs rather large number of pseu- 
doparticles, it is not efficient when the required 
accuracy is modest. But it may be useful for cal- 
culations that require very high accuracy. In the 
following, we briefly describe his procedure. Here 
we use spherical coordinates for mathematical con- 
venience. 

The multipole expansion of the potential $(r) 
is expressed as 



$(r) 
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in spherical coordinates r — (r,6,<p). Here 
Y{ n {6 1 (f)) is the spherical harmonics and a™ are 
the expansion coefficients. In order to approxi- 
mate the potential field due to the distribution of 
N particles, the coefficients should satisfy 
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where m, and r, = (r, , 0j , fa ) are the masses and 
positions of the particles, and * denotes the com- 
plex conjugate. In order to express the expansion 
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<&(r) up to the p-th order, K pseudoparticles must 
give the same coefficients aj™ f° r au (p + I) 2 com- 
binations of I and m in the range of < I < p 
and —l<m<l. Thus, their masses Mj and the 
positions Rj = (Rj,9j,(f)j) should satisfy 

N K 

]T mi r\Yr\^ 4>i) = E M j R l j Y l m *(9 j , 

i=l j=l 

(9) 

for < 2 < p and —l<m<l. The mass Mj 
and the position i?j are obtained as the solution 
of this equation. 

As we have seen in the previous section, it is 
possible to directly solve this equation for rela- 
tively small p, say, p < 2. For large p, it is difficult 
because the equation is nonlinear. In addition, it 
is not clear whether or not an acceptable solution 
exists for arbitrary distribution of physical parti- 
cles. 

The key idea of Makino's approach is to fix the 
positions of pseudoparticles. Equation (9) is non- 
linear for the positions Rj, but is linear for the 
masses Mj. Thus, if we fix the positions, the equa- 
tion becomes linear. On the other hand, the neces- 
sary number of pseudoparticles increases, since we 
can adjust only the masses of pseudoparticles, and 
the degree of freedom assigned to each particle is 
reduced from four to one. 

We restrict the distribution of pseudoparticles 
to the surface of a sphere centered at the origin, 
so that the the mathematics are further simplified. 
With this restriction, the coefficients of multipolc 
expansion are expressed as the spherical harmonic 
expansion of the masses of pseudoparticles. If we 
choose the positions of pseudoparticles so that the 
numerical integration over their positions retains 
the orthogonality of spherical harmonics up to the 
p-th order, the mass of pseudoparticles are ob- 
tained as the inverse transform of the expansion 
through numerical integration. 

In Makino's procedure, the spherical t-design 
(McLaren 1963; Hardin & Sloane 1996) is used 
as the distribution of pseudoparticles. The spher- 
ical i-design is defined as a set of points on a 
unit sphere with the following characteristics. The 
summation of an arbitrary polynomial of a degree 
at most t over the points exactly match the inte- 
gration over the sphere (except for a constant co- 
efficient). Using spherical i-design, we can achieve 



expansion order p = [t/2\. Here [^J denotes the 
maximum integer not larger than x. For example, 
multipole expansion of order p — 1,2,3 and 4 are 
expressed with number of points K = 4, 12, 24 and 
36, respectively. See Hardin & Sloane (1996) for 
the number of points for higher p, and the position 
coordinates of the points. 

4. Numerical Tests 

Using P 2 M 2 , we implemented an arbitrary- 
order treecode (hereafter P 2 M 2 treecode) on 
GRAPE-4 (Makino et al. 1997; Makino & Taiji 
1998) and a UNIX workstation. The source code 
for these implementations are available upon re- 
quest. In the following we show the results of 
numerical tests for the accuracy and the per- 
formance. For the measurement, we used one 
GRAPE-4 processor board (47 processors, 15 
Gflops). The host computer for GRAPE-4 is 
a COMPAQ AlphaStation XP1000 with Alpha 
21264 processor, running at 500 MHz. 

We uniformly distributed 262144 equal-mass 
particles within a unit sphere. Then we calculated 
the force on each particle with P 2 M 2 treecode, and 
measured relative error e and calculation time T 
for various values of the opening angle 9. The er- 
ror e is defined as r.m.s. relative difference from 
the exact force. As the exact force, we used the 
force calculated with direct summation algorithm 
on the host computer using IEEE-754 standard 
64-bit arithmetic. 

Figures 2 and 3 show the results. In figure 2, the 
errors for P 2 M 2 treecode of orders p = 1 through 
4 are plotted as functions of 9. We can see that 
the error roughly scales as 9 P+15 . This behavior 
agrees well with the theoretical estimate given by 
Makino (1990). The saturation of the accuracy 
at e w 10~ 7 is due to the hardware limitation of 
GRAPE-4. 

In figure 3, the force error e is plotted as func- 
tions of the calculation time T. We can see that 
P 2 M 2 treecode of order p = 2 is the fastest for a 
wide range of accuracy (10~ 7 < e < 10~ 3 ). For 
low accuracy (e > 10~ 3 ), the treecode of order 
p = 1 is the fastest. For extremely high accu- 
racy (e < 10~ 7 ), P 2 M 2 treecode of order p = 4 
is likely to be the fastest, if the accuracy is not 
limited by the hardware. We performed the same 
test without GRAPE-4, and found that GRAPE- 
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Fig. 3. — The r.m.s. relative force error e plotted against the calculation time T. Meaning of the symbols 
are the same as those in figure 2. 
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Fig. 2. — The r.m.s. relative force error e plot- 
ted against the opening angle 9. Solid curve with 
filled squares shows the result of P 2 M 2 treecode of 
order p = 2. Four long-dashed curves with open 
triangle, square, pentagon, and hexagon are for 
generalized P 2 M 2 treecode of p = 1,2,3, and 4, 
respectively. Solid curve with filled triangles is for 
treecode of order p=l. 



4 accelerates the calculation by a factor of 10 (for 
e w 1(T 2 ) to 150 (for e « 10~ 6 ). 

5. An Example of Simulation 

Here we discuss the overall accuracy of the sim- 
ulation with P 2 M 2 treecode on GRAPE-4. We 
have already seen our code attain high accuracy 
in calculation of instantaneous force, but this high 
accuracy does not necessarily guarantee the accu- 
racy of the total simulation. For example, if the 
force errors in consecutive timesteps are strongly 
correlated, they accumulate. As a result, over- 
all accuracy of the orbits of stars might be worse, 
compared to the case with weaker correlation. 

We performed a simulation of the collision of 
two identical galaxies. We chose the system of 
units so that the total mass of each galaxy and 
the gravitational constant are both unity and the 
total energy of each galaxy is —1/4. Galaxy model 
we used is the Hernquist model (Hernquist 1990) 
with 65536 equal-mass particles. We cut off the 
distribution at radius 20. We place the galaxies at 
initial separation 3.0, and gave initial velocities so 
that they would follow a parabolic head on colli- 
sion. We integrated this system up to t = 16 with 
time step At = 1/200 and softening parameter 
e = 1/100. We performed the same simulation us- 
ing three different codes, namely, P 2 M 2 treecode 
of order p = 2, the treecode of order p=l, and the 
direct summation algorithm. For treecode simula- 
tions, the opening angle 9 is 0.5. 

Figure 4 shows the results. The relative error 
of the total energy, dE(t)/E(0), is plotted as a 
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function of time t. For all runs, we can see three 
bumps around t = 2,7, and 8, which correspond 
to close encounters of two galaxies. These bumps 
comes from the time-integration error, since they 
can be seen in the result of the direct summation 
algorithm with highly accurate force calculation. 
We can regard the deviation from the result of the 
direct summation as being caused by the error of 
the force calculated with the treecodes. For the 
treecode with p = 2, the maximum deviation from 
the direct summation is 5 x 10~ 5 , while for p = 1 
it is 4 x 10~ 4 . This a factor of 10 difference is 
consistent with the difference of the accuracy of 
the instantaneous force shown in figure 2. Thus, 
we can conclude that the high accuracy in the in- 
stantaneous force offered by our new P 2 M 2 does 
improve the overall accuracy of the time integra- 
tion. 
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Fig. 4. — The relative error of the total energy 
plotted as a function of time. Solid, long-dashed 
and short-dashed curves are for treecodes with 
p = 2, p = 1, and direct summation algorithm, 
respectively. 
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